Mining Generalized Association Rules

Authors

  • Ramakrishnan Srikant
  • Rakesh Agrawal
Abstract

We introduce the problem of mining generalized association rules. Given a large database of transactions, where each transaction consists of a set of items, and a taxonomy (is-a hierarchy) on the items, we find associations between items at any level of the taxonomy. For example, given a taxonomy that says that jackets is-a outerwear is-a clothes, we may infer a rule that “people who buy outerwear tend to buy shoes”. This rule may hold even if rules that “people who buy jackets tend to buy shoes”, and “people who buy clothes tend to buy shoes” do not hold. An obvious solution to the problem is to add all ancestors of each item in a transaction to the transaction, and then run any of the algorithms for mining association rules on these “extended transactions”. However, this “Basic” algorithm is not very fast; we present two algorithms, Cumulate and EstMerge, which run 2 to 5 times faster than Basic (and more than 100 times faster on one real-life dataset). We also present a new interest-measure for rules which uses the information in the taxonomy. Given a user-specified “minimum-interest-level”, this measure prunes a large number of redundant rules; 40% to 60% of all the rules were pruned on two real-life datasets.

*Also, Department of Computer Science, University of Wisconsin, Madison.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Proceedings of the 21st VLDB Conference, Zurich, Switzerland, 1995
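The "extended transactions" step that the abstract attributes to the Basic approach can be illustrated with a minimal sketch. This is not the authors' implementation: the taxonomy, the transactions, and the helper names (`PARENTS`, `ancestors`, `extend_transaction`, `support`) are all hypothetical, chosen to mirror the jackets/outerwear/clothes example, and a plain support count stands in for a full association-rule miner.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical taxonomy (assumption): each item maps to its direct parents.
PARENTS: Dict[str, List[str]] = {
    "jackets": ["outerwear"],
    "ski pants": ["outerwear"],
    "outerwear": ["clothes"],
    "shirts": ["clothes"],
    "shoes": ["footwear"],
}


def ancestors(item: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every ancestor of `item` by walking the is-a hierarchy upward."""
    seen: Set[str] = set()
    stack = list(parents.get(item, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def extend_transaction(
    transaction: Iterable[str], parents: Dict[str, List[str]]
) -> FrozenSet[str]:
    """Add all ancestors of each item to the transaction."""
    items = set(transaction)
    extended = set(items)
    for item in items:
        extended |= ancestors(item, parents)
    return frozenset(extended)


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    wanted = set(itemset)
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


# Hypothetical transactions (assumption), then their extended versions.
TRANSACTIONS = [
    {"jackets", "shoes"},
    {"ski pants", "shoes"},
    {"shirts"},
    {"jackets"},
]
EXTENDED = [extend_transaction(t, PARENTS) for t in TRANSACTIONS]

print(support({"jackets", "shoes"}, EXTENDED))
print(support({"outerwear", "shoes"}, EXTENDED))
```

In this toy data, {outerwear, shoes} is supported by more transactions than {jackets, shoes}, because purchases of jackets and of ski pants both count toward outerwear once the ancestors are added. Any standard frequent-itemset miner could then be run on `EXTENDED` unchanged, which is the sense in which the abstract calls this the obvious solution.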

Similar resources

The fuzzy data mining generalized association rules for quantitative values

Due to the increasing use of very large databases and data warehouses, mining useful information and helpful knowledge from transactions is evolving into an important research area. Most conventional data-mining algorithms identify the relationships among transactions using binary values and find rules at a single concept level. Transactions with quantitative values and items with hierarchy rel...


Preknowledge-based generalized association rules mining

The subject of this paper is the mining of generalized association rules using pruning techniques. Given a large transaction database and a hierarchical taxonomy tree of the items, we attempt to find the association rules between the items at different levels in the taxonomy tree under the assumption that original frequent itemsets and association rules have already been generated in advance. T...


Obtaining and Evaluating Generalized Association Rules

Generalized association rules are rules that contain some background knowledge giving a more general view of the domain. This knowledge is codified by a taxonomy set over the data set items. Many researches use taxonomies in different data mining steps to obtain generalized rules. So, this work initially presents an approach to obtain generalized association rules in the post-processing data mi...


Maintenance of Generalized Association Rules Under Transaction Update and Taxonomy Evolution

Mining generalized association rules among items in the presence of taxonomies has been recognized as an important model in data mining. Earlier work on mining generalized association rules ignores the fact that the taxonomies of items cannot be kept static while new transactions are continuously added into the original database. How to effectively update the discovered generalized association r...


Incremental maintenance of generalized association rules under taxonomy evolution

Mining association rules from large databases of business data is an important topic in data mining. In many applications, there are explicit or implicit taxonomies (hierarchies) for items, so it may be useful to find associations at levels of the taxonomy other than the primitive concept level. Previous work on the mining of generalized association rules, however, assumed that the taxonomy of ...


A New Algorithm for Faster Mining of Generalized Association Rules

Generalized association rules are a very important extension of boolean association rules, but with current approaches mining generalized rules is computationally very expensive. Especially when considering the rule generation as being part of an interactive KDD-process this becomes annoying. In this paper we discuss strengths and weaknesses of known approaches to generate frequent itemsets. Ba...


Journal title:
  • Future Generation Comp. Syst.

Volume 13  Issue 

Pages  -

Publication date: 1995